ABSTRACT


Models and techniques that assist fraud investigators in
efficient Credit Card Fraud Detection (CCFD) rely on
machine learning algorithms. Building a predictive model
for credit card fraud is challenging, however, mainly
because the data are highly distributed and imbalanced and
only a few transactions are labelled as fraud among all
transactions. The role of a prediction model is to
determine whether a transaction on an e-commerce website
is fraudulent, and identifying such transactions can be
treated as a machine learning (ML) problem. Several
studies report that the use of ensemble methods in ML
improves the performance of prediction and classification
tasks. This paper surveys fine classification and ensemble
methods that are helpful in building a model for CCFD. It
then proposes a predictive model for CCFD based on an
ensemble method. The dataset from the UCSD-FICO Data
Mining competition is used for building and testing the
model. The results obtained show that the predictive
model has potential in determining fraud and minimizing
risk in e-commerce transactions. The paper also outlines
directions for future research in the field.
1. INTRODUCTION
The Credit Card is used as a mode of payment and is
issued to the customer of a bank or similar financial
organization. It allows its holder to buy goods or
services. It is generally made of plastic and carries some
secret numbers and the cardholder's promise to pay for the
goods and services availed [1, 2]. Figure 1 depicts
‘Clearing and Settlement under Credit Card System’.
Figure 1: Clearing and Settlement under Credit
Card System 

2. LITERATURE REVIEW 
Ensemble learning is an important Machine Learning (ML)
technique that combines the outputs of multiple individual
classifiers to improve classification accuracy [19, 20].
Theoretical and experimental results suggest that
combining classifiers can give an effective improvement in
accuracy if the classifiers within an ensemble are not
correlated with each other [21, 22].
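
The effect of uncorrelated classifiers can be illustrated with a
short calculation. The sketch below is not taken from the cited
works; it assumes an idealised case in which every classifier is
correct with the same probability and errors are fully independent,
which real ensembles only approximate. The function name
majority_vote_accuracy is a hypothetical helper introduced here.

```python
from math import comb


def majority_vote_accuracy(n_classifiers: int, p_correct: float) -> float:
    """Probability that a majority of n independent classifiers is correct.

    Assumption: binary task, odd n, identical accuracy p_correct for every
    classifier, and statistically independent errors.
    """
    if n_classifiers % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    k_min = n_classifiers // 2 + 1  # fewest correct votes that still win
    return sum(
        comb(n_classifiers, k)
        * p_correct ** k
        * (1 - p_correct) ** (n_classifiers - k)
        for k in range(k_min, n_classifiers + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5, 11):
        print(n, round(majority_vote_accuracy(n, 0.7), 4))
```

Under these assumptions, three classifiers that are each right 70% of
the time reach a majority-vote accuracy of 0.784, and the figure keeps
rising as more independent classifiers are added. Correlated errors
would reduce this gain.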
3. PROPOSED WORK 
The framework proposed in this work is depicted in Figure
2. It processes each transaction and classifies it as high
risk or low risk using the proposed method. The predictive
model can further be used to generate alerts for high-risk
transactions. Investigators check these alerts and provide
feedback for each alert, i.e. true positive (fraud) or
false positive (genuine). The proposed model combines
suitable pre-processing and attribute selection techniques
with the proposed Bagging ensemble method (EM). K-fold
cross-validation is used as the ‘test split’ method.
Figure 2: Proposed Model for Credit Card Fraud
Detection 
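
The bagging idea behind the proposed model can be sketched in a few
lines. This is a minimal illustration, not the paper's Weka
implementation: the base learner here is a simple decision stump
chosen only for brevity, and toy_transactions, fit_stump and
fit_bagging are hypothetical names. The toy rows of [amount, hour]
are invented for the example and are not drawn from the UCSD-FICO
data.

```python
import random
from collections import Counter


def toy_transactions():
    """Hypothetical data: [amount, hour] rows, labelled 1 when amount > 500."""
    X = [[float(a), float(a % 24)] for a in range(10, 1010, 50)]
    y = [1 if row[0] > 500 else 0 for row in X]
    return X, y


def fit_stump(X, y):
    """Fit a one-feature threshold rule (decision stump) by exhaustive search."""
    best = None  # (errors, feature, threshold, label_if_le, label_if_gt)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for lo, hi in ((0, 1), (1, 0)):
                errors = sum(
                    (lo if row[f] <= t else hi) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, lo, hi)
    _, f, t, lo, hi = best
    return lambda row: lo if row[f] <= t else hi


def fit_bagging(X, y, n_estimators=15, seed=0):
    """Train stumps on bootstrap samples and combine them by majority vote."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))

    def predict(row):
        votes = Counter(m(row) for m in models)
        return votes.most_common(1)[0][0]  # odd n_estimators avoids ties

    return predict


if __name__ == "__main__":
    X, y = toy_transactions()
    model = fit_bagging(X, y)
    print(model([5.0, 3.0]), model([990.0, 3.0]))
```

Each stump sees a different bootstrap sample, so the individual
thresholds differ slightly. The majority vote smooths out these
differences, which is the property bagging relies on.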




4. EXPERIMENTAL SETUP, METHODOLOGY AND 
PERFORMANCE ANALYSIS 

4.1 Experimental Setup 

Weka 3.8.1 is used as the data mining (DM) tool for
simulation. Weka is installed on the Windows 10 operating
system. For this research, a state-of-the-art research
dataset from the UCSD-FICO competition [7, 8] is used. The
competition was hosted by FICO, a leading provider of
technologies, and the University of California, San Diego.
The dataset description is presented in Table 1.


Dataset   | No. of Features | Total Instances | No. of Instances (Yes) | No. of Instances (No)
UCSD-FICO | 20              | 100,000         | 97346 (97.35%)         | 2654 (2.65%)


Table 1: Details of Dataset 
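
The class proportions in Table 1 can be checked with simple
arithmetic. The snippet below is only a worked example using the two
counts from the table; class_balance is a hypothetical helper name.

```python
def class_balance(n_yes: int, n_no: int) -> dict:
    """Summarise a two-class dataset from its per-class instance counts."""
    total = n_yes + n_no
    return {
        "total": total,
        "pct_yes": 100 * n_yes / total,
        "pct_no": 100 * n_no / total,
        # how many majority-class instances exist per minority-class instance
        "imbalance_ratio": max(n_yes, n_no) / min(n_yes, n_no),
    }


if __name__ == "__main__":
    stats = class_balance(97346, 2654)  # counts taken from Table 1
    for key, value in stats.items():
        print(key, round(value, 2))
```

The two counts add up to 100,000 instances, and there are roughly 37
majority-class instances for every minority-class instance. This is
the imbalance the abstract refers to.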
4.2 Methodology 
The experimental methodology involves the following steps:

1. Implementing and integrating the proposed
method with the base classifier and building the
proposed classifier / model using the “training”
dataset.

2. Applying the base classifier to build a model with
the same training dataset.

3. Evaluating the proposed and base classifiers
on various metrics.

4. Comparing the proposed classifier with
benchmark classifiers.
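
The steps above can be sketched as a small k-fold evaluation harness.
This is an illustration under stated assumptions, not the Weka
workflow used in the experiments. Both classifiers, fit_majority and
fit_nearest_mean, are hypothetical placeholders standing in for a
base classifier and a proposed one, and the data are synthetic.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(fit, X, y, k=10, seed=0):
    """Return the mean accuracy of `fit` over k train/test splits."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        correct = sum(model(X[j]) == y[j] for j in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k


def fit_majority(X, y):
    """Placeholder baseline: always predict the most frequent training label."""
    label = max(set(y), key=y.count)
    return lambda row: label


def fit_nearest_mean(X, y):
    """Placeholder classifier: nearest class mean on the first feature."""
    means = {
        c: sum(r[0] for r, lab in zip(X, y) if lab == c) / y.count(c)
        for c in set(y)
    }
    return lambda row: min(means, key=lambda c: abs(row[0] - means[c]))


if __name__ == "__main__":
    # Synthetic, imbalanced data: 30 rows of class 0 and 10 rows of class 1.
    X = [[float(v)] for v in range(30)] + [[float(v)] for v in range(100, 110)]
    y = [0] * 30 + [1] * 10
    print("baseline:", cross_validate(fit_majority, X, y, k=10))
    print("candidate:", cross_validate(fit_nearest_mean, X, y, k=10))
```

Both models are trained and tested on the same folds, so any
difference in their scores comes from the classifiers and not from
the way the data were split.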

4.3 Performance Analysis 

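The performance analysis is based on the following
metrics:

Prediction Rate: Prediction rate refers to the percentage
of correct predictions among all test data. With TP, TN,
FP and FN denoting the numbers of true positives, true
negatives, false positives and false negatives, it is
defined as follows:

\[ \text{Prediction Rate} = 100 \times \frac{TP + TN}{TP + TN + FP + FN} \]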

False Alarm Rate (FAR): FAR is the percentage of
normal data that is wrongly recognized as belonging to a
different class, and is defined as follows:

\[ \text{False Alarm Rate} = 100 \times \frac{FP}{FP + TN} \]
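
Both metrics follow directly from the confusion-matrix counts. The
sketch below is a minimal illustration: the function names and the
ten example labels are hypothetical, and the fraud class is assumed
to be the positive class (label 1).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a binary task."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def prediction_rate(tp, tn, fp, fn):
    """Percentage of correct predictions among all test instances."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def false_alarm_rate(fp, tn):
    """Percentage of normal (negative) instances flagged as positive."""
    return 100.0 * fp / (fp + tn)


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    print(prediction_rate(tp, tn, fp, fn))  # 8 of 10 correct -> 80.0
    print(false_alarm_rate(fp, tn))         # 1 of 8 normal flagged -> 12.5
```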


The performance analysis is shown in Table 2. 


Metrics          | Naive Bayes | K-Nearest Neighbour | Proposed Method
Prediction Rate  | 96.9%       | 96.1%               | 97.8%
False Alarm Rate | 0.68        | 0.72                | 0.65

Table 2: Performance Analysis


5. CONCLUSION & FUTURE WORK 

Credit card frauds are common because attackers
gather information through transactions and
registered accounts. This opens new challenges
in the field of fraud detection and prevention,
although prevention is of course better than
detection. Simple techniques such as database
comparison and pattern matching are not
enough for detecting such frauds, because
fraudulent transactions are rare among a huge
number of genuine transactions. Predictive
models are therefore of prime importance for
banks to detect credit card frauds (CCFs). The
proposed ensemble-based CCFD predictive
model is compared with the base learner and a
state-of-the-art model on two metrics,
Prediction Rate and False Alarm Rate (FAR), and
proved to be better on both. The results suggest
that ensemble ML methods are well suited to
detecting credit card fraud. In future, further
methods will be worked out to improve the
fraud catching rate. The proposed predictive
model will also be integrated with a live
stream to identify online fraudulent
transactions instantly. We also intend to build
a cloud-based ML application for detecting
fraud in financial transactions made with cards.